Part-of-Speech is (almost) enough: SAP Research & Innovation at the #Microposts2014 NEEL Challenge
Abstract
This paper describes the submission of the SAP Research & Innovation team to the #Microposts2014 NEEL Challenge. We use a two-stage approach to named entity extraction and linking: extraction is based on conditional random fields, and linking on an ensemble of search APIs and rules. A surprising result of our work is that part-of-speech tags alone are almost sufficient for entity extraction. For the combined extraction and linking task, our F1 scores are 34.6% and 37.2% on a development and a test split of the training set, respectively, and 37% on the test set.
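To make the "part-of-speech tags alone" idea concrete, here is a minimal sketch of how per-token features for a CRF-style sequence labeler could be built from POS tags only. The abstract does not give the paper's actual feature set, so the function names (`pos_features`, `sentence_features`, `f1_score`), the window size, the `PAD` symbol, and the example tags are all illustrative assumptions. The F1 helper uses the standard harmonic-mean definition.

```python
from typing import Dict, List, Tuple

# A tagged sentence is a list of (token, POS tag) pairs.
# The tags below are hand-written for illustration, not tagger output.
TaggedSentence = List[Tuple[str, str]]


def pos_features(tagged: TaggedSentence, i: int, window: int = 2) -> Dict[str, str]:
    """Features for token i that use only POS tags in a +/- window.

    The token text itself is deliberately ignored (hypothetical feature set).
    """
    feats = {"bias": "1"}
    for offset in range(-window, window + 1):
        j = i + offset
        key = f"pos[{offset:+d}]"
        # Positions outside the sentence get a padding symbol.
        feats[key] = tagged[j][1] if 0 <= j < len(tagged) else "PAD"
    return feats


def sentence_features(tagged: TaggedSentence, window: int = 2) -> List[Dict[str, str]]:
    """One feature dict per token, in the shape CRF toolkits commonly accept."""
    return [pos_features(tagged, i, window) for i in range(len(tagged))]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


tweet: TaggedSentence = [("Obama", "NNP"), ("visits", "VBZ"), ("Berlin", "NNP")]
features = sentence_features(tweet, window=1)
```

Each dict in `features` could then be passed, together with entity labels such as BIO tags, to a CRF trainer. That training step is left out because the abstract does not say which CRF implementation was used.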
Related resources
Making Sense of Microposts (#Microposts2014) Named Entity Extraction & Linking Challenge
Microposts are small fragments of social media content and a popular medium for sharing facts, opinions and emotions. They comprise a wealth of data which is increasing exponentially, and which therefore presents new challenges for the information extraction community, among others. This paper describes the ‘Making Sense of Microposts’ (#Microposts2014) Workshop’s Named Entity Extraction and Li...
The Open University's repository of research publications and other research outputs: Making sense of microposts: (#Microposts2014) named entity extraction & linking challenge
Microposts are small fragments of social media content and a popular medium for sharing facts, opinions and emotions. They comprise a wealth of data which is increasing exponentially, and which therefore presents new challenges for the information extraction community, among others. This paper describes the ‘Making Sense of Microposts’ (#Microposts2014) Workshop’s Named Entity Extraction and Li...
DataTXT at #Microposts2014 Challenge
In this paper we describe the approach taken for the “Making Sense of Microposts challenge 2014” (#Microposts2014), where participants were asked to cross reference micro-posts extracted from Twitter with DBpedia URIs belonging to a given taxonomy. For this task we deployed dataTXT which is the evolution of Tagme[3], the state-of-the-art topic annotator for short texts and which has proven to b...
Design and Implementation of an Intelligent Part of Speech Generator
The aim of this paper is to report on an attempt to design and implement an intelligent system capable of generating the correct part of speech for a given sentence while the sentence is totally new to the system and not stored in any database available to the system. It follows the same steps a normal individual does to provide the correct parts of speech using a natural language processor. It...
A Study of the Features and Functions of Speech Perseverance (With an Emphasis on the Alavi Teachings)
The serious challenge that contemporary humans face has been brought about by a failure to apply ethical and behavioral necessities in life rather than by weak rules or a lack of technology. One of these important necessities is speech perseverance, which has a particular conceptual and meaningful weight that is the adducing of the right spe...
Publication date: 2014